14 research outputs found

    A Unified Approach To Collaborative Data Visualization

    Much effort has lately been concentrated on increasing the precision of recommendations following the Netflix Prize competition. Recently, many researchers and industry practitioners have noted that other factors, such as adequate presentation of the results, can add more utility to a recommender system than a slight improvement in precision. In this paper, we suggest a methodology for user-friendly representation of recommendations to end users. Our scheme unifies the two objectives of prediction and visualization in a single approach. Users and items are first embedded into a high-dimensional latent feature space according to a predictor function specifically designed to meet visualization requirements. The data is then projected into a 2-dimensional space by Curvilinear Component Analysis (CCA). CCA draws Personalized Item Maps (PIMs) representing a small subset of items to the active user. The intra-item semantic correlations are preserved in PIMs, a property inherited from the clustering property of the high-dimensional embedding space. Our prediction function and the projection method are both non-linear, to increase the clarity of the maps and to limit the effect of projection error. The algorithms are tested on three versions of the MovieLens dataset and the Netflix dataset to show that they combine good accuracy with satisfactory visual properties.

    Hybrid Weighting Schemes For Collaborative Filtering

    Neighborhood-based algorithms are one of the most common approaches to Collaborative Filtering (CF). The core element of these algorithms is similarity computation between items or users. It is reasonable to assume that some ratings of a user bear more information than others. Weighting the ratings in proportion to their importance is known as feature weighting. Nevertheless, in practice, none of the existing weighting schemes results in a significant improvement in the quality of recommendations. In this paper, we suggest a new weighting scheme based on Matrix Factorization (MF). In our scheme, the importance of each rating is estimated by comparing the coordinates of users (items) taken from a latent feature space computed through MF. Moreover, we review the effect of a large number of weighting schemes on item-based and user-based algorithms. The effect of various influential parameters is studied by running extensive simulations on two versions of the MovieLens dataset. We show that, unlike the existing weighting schemes, ours can improve the performance of CF algorithms. Furthermore, cascading the schemes capitalizes on each other's improvements.
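    As a rough illustration of the feature weighting mentioned in the abstract above, the sketch below computes a cosine similarity between two rating vectors in which each co-rated entry is scaled by a weight. It is not the MF-based scheme of the paper: the function name `weighted_cosine`, the convention that 0 means "not rated", and the weight values are all assumptions made for the example.

```python
import numpy as np

def weighted_cosine(a, b, w):
    """Weighted cosine similarity between rating vectors a and b.

    Assumptions (illustrative only): a rating of 0 means "not rated",
    and w holds one non-negative importance weight per rating position.
    """
    a, b, w = (np.asarray(x, dtype=float) for x in (a, b, w))
    mask = (a != 0) & (b != 0)          # keep only co-rated positions
    if not mask.any():
        return 0.0                      # no overlap: similarity undefined, return 0
    a, b, w = a[mask], b[mask], w[mask]
    num = np.sum(w * a * b)
    den = np.sqrt(np.sum(w * a * a)) * np.sqrt(np.sum(w * b * b))
    return float(num / den) if den > 0 else 0.0

# Hypothetical ratings of two items by four users, with uniform weights.
sim = weighted_cosine([5, 3, 0, 1], [4, 2, 2, 5], [1.0, 1.0, 1.0, 1.0])
```

    With uniform weights this reduces to the plain cosine similarity over co-rated entries; non-uniform weights let some ratings count more than others.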

    Les Techniques De Recommandation Et De Visualisation Pour Les Données A Une Grande Echelle

    No full text
    We have witnessed the rapid development of information technology during the last decade. On one side, the processing and storage capacity of digital devices is increasing constantly thanks to advances in construction methods. On the other side, the interaction between these powerful devices has been made possible through networking technology. As a natural consequence of this progress, the volume of data generated in different applications has grown at an unprecedented rate. Consequently, it is becoming increasingly hard for internet users to find items and content matching their needs. Henceforth, we are confronted with new challenges in efficiently processing and representing the huge mass of data at our disposal. This thesis is centered around the two axes of recommending relevant content and its proper visualization. The role of recommender systems is to help users in the decision-making process to find items with relevant content and satisfactory quality among the large set of alternatives existing on the Web. On the other hand, the adequate representation of the processed data is central both for increasing its utility to the end user and for designing efficient analysis tools. In this presentation, the prevalent approaches to recommender systems and the principal techniques for visualization of data in the form of graphs are discussed. Furthermore, it is shown how some of the same techniques applied to recommender systems can be modified to consider visualization requirements.

    Recommendation and visualization techniques for large scale data

    No full text
    We have witnessed the rapid development of information technology during the last decade. On one side, the processing and storage capacity of digital devices is increasing constantly thanks to advances in construction methods. On the other side, the interaction between these powerful devices has been made possible through networking technology. As a natural consequence of this progress, the volume of data generated in different applications has grown at an unprecedented rate. Consequently, it is becoming increasingly hard for internet users to find items and content matching their needs. Henceforth, we are confronted with new challenges in efficiently processing and representing the huge mass of data at our disposal. This thesis is centered around the two axes of recommending relevant content and its proper visualization. The role of recommender systems is to help users in the decision-making process to find items with relevant content and satisfactory quality among the large set of alternatives existing on the Web. On the other hand, the adequate representation of the processed data is central both for increasing its utility to the end user and for designing efficient analysis tools. In this presentation, the prevalent approaches to recommender systems and the principal techniques for visualization of data in the form of graphs are discussed. Furthermore, it is shown how some of the same techniques applied to recommender systems can be modified to consider visualization requirements.
    RENNES1-BU Sciences Philo (352382102) / Sudoc, France

    Data Visualization Via Collaborative Filtering

    Collaborative Filtering (CF) is the most successful approach to Recommender Systems (RS). In this paper, we suggest methods for global and personalized visualization of CF data. Users and items are first embedded into a high-dimensional latent feature space according to a predictor function specifically designed to conform with visualization requirements. The data is then projected into a 2-dimensional space by Principal Component Analysis (PCA) and Curvilinear Component Analysis (CCA). Each projection technique targets a different application and has its own advantages. PCA places all items on a Global Item Map (GIM) such that the correlation between their latent features is revealed optimally. CCA draws Personalized Item Maps (PIMs) representing a small subset of items to a specific user. Unlike in a GIM, the user is present in a PIM, and items are placed closer to or further from her based on their predicted ratings. The intra-item semantic correlations are inherited from the high-dimensional space as much as possible. The algorithms are tested on three versions of the MovieLens dataset and the Netflix dataset to show that they combine good accuracy with satisfactory visual properties. We rely on a few examples to argue that our methods can reveal links that are hard to extract, even if explicit item features are available.
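    The PCA step described in the abstract above can be sketched as follows. This is only a minimal illustration of projecting latent item features onto their two leading principal components; the function name `pca_project_2d` and the random `item_factors` matrix are hypothetical stand-ins, not the paper's predictor or data, and the CCA step is not shown.

```python
import numpy as np

def pca_project_2d(latent):
    """Project the rows of `latent` (n_items x k) onto the top two principal components."""
    centered = latent - latent.mean(axis=0)
    # Rows of vt are the principal directions, ordered by decreasing variance.
    _, _, vt = np.linalg.svd(centered, full_matrices=False)
    return centered @ vt[:2].T

# Hypothetical latent features: 50 items in a 10-dimensional space.
rng = np.random.default_rng(0)
item_factors = rng.normal(size=(50, 10))
coords = pca_project_2d(item_factors)   # one (x, y) position per item
```

    Each row of `coords` could then be plotted as a point on a 2-dimensional item map.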

    Energy Models for Drawing Signed Graphs

    Graph drawing is the pictorial representation of graphs in a multi-dimensional space. Energy models are the prevalent approach to graph drawing. In this paper, we propose energy models for drawing signed unidirectional graphs where edges are labeled either as positive (attractive) or as negative (repulsive). The existing energy models do not discriminate between edge signs. Hence, they do not lend themselves to drawing signed graphs. We suggest a general equation for signed energy models by proposing a dual energy model for graphs containing only negative edges and combining it linearly with the primary model. We then concentrate on revealing the community structure of social network graphs (sociograms) where the edge sign represents the state of the relationship between two individuals. To this end, the Signed LinLog model is built on the LinLog model, whose clustering properties for unsigned graphs are already known. The properties of the Signed LinLog model are outlined analytically, and its layouts on synthetic and real graphs are presented.

    Addressing Sparsity in Decentralized Recommender Systems through Random Walks

    The need for efficient decentralized recommender systems has been appreciated for some time, both for the intrinsic advantages of decentralization and the necessity of integrating recommender systems into existing P2P applications. On the other hand, the accuracy of recommender systems is often hurt by data sparsity. In this paper, we compare different decentralized user-based and item-based Collaborative Filtering (CF) algorithms with each other, and propose a new user-based random walk approach customized for decentralized systems and specifically designed to handle sparse data. We show how the application of random walks to decentralized environments differs from the centralized version. We examine the performance of our random walk approach in different settings by varying the sparsity, the similarity measure and the neighborhood size. In addition, we introduce the popularizing disadvantage of the significance weighting term traditionally used to increase the precision of similarity measures, and elaborate on how it can affect the performance of the random walk algorithm. The simulations on the MovieLens 10,000,000-ratings dataset demonstrate that, over a wide range of sparsity, our algorithm outperforms other decentralized CF schemes. Moreover, our results show that decentralized user-based approaches perform better than their item-based counterparts in P2P recommender applications.

    Application of Random Walks to Decentralized Recommender Systems

    The need for efficient decentralized recommender systems has been appreciated for some time, both for the intrinsic advantages of decentralization and the necessity of integrating recommender systems into P2P applications. On the other hand, the accuracy of recommender systems is often hurt by data sparsity. In this paper, we compare different decentralized user-based and item-based Collaborative Filtering (CF) algorithms with each other, and propose a new user-based random walk approach customized for decentralized systems and specifically designed to handle sparse data. We show how the application of random walks to decentralized environments differs from the centralized version. We examine the performance of our random walk approach in different settings by varying the sparsity, the similarity measure and the neighborhood size. In addition, we introduce the popularizing disadvantage of the significance weighting term traditionally used to increase the precision of similarity measures, and elaborate on how it can affect the performance of the random walk algorithm. The simulations on the MovieLens 10,000,000-ratings dataset demonstrate that, over a wide range of sparsity, our algorithm outperforms other decentralized CF schemes. Moreover, our results show that decentralized user-based approaches perform better than their item-based counterparts in P2P recommender applications.

    FlexGD: A Flexible Force-directed Model for Graph Drawing, Inria

    We propose FlexGD, a force-directed algorithm for straight-line undirected graph drawing. The algorithm strives to draw graph layouts ranging from uniform vertex distribution to extreme structure abstraction. It is flexible in that it is parameterized so that the emphasis can be put on either of the two drawing criteria. The parameter determines how much shorter the edges are than the average distance between vertices. Extending the clustering property of the LinLog model, FlexGD is efficient for cluster visualization at an adjustable level. The energy function of FlexGD is minimized through a multilevel approach specifically designed to work in contexts where the edge length distribution is not uniform. Applying FlexGD to several real datasets, we illustrate both the good quality of the layout on various topologies and the ability of the algorithm to meet the addressed drawing criteria.